Distributed Computing: Moving from CGI to CORBA
Abstract
In this paper, we document the evolution of a banner ad delivery system from a simple CGI script written in Perl running on a single host into a distributed computing application using CORBA. While CORBA has an established history in the enterprise-computing world, it is only recently that the OpenSource® community has begun to embrace it. Starting without any RPC programming experience, it took TargetNet a little less than half a year to integrate CORBA into the Apache web server and convert all their CGI programs into CORBA servers. Performance of the system increased from 50 transactions per second to over 400 per second. Thanks to the cross-platform capabilities of CORBA, future components can be developed on virtually any operating system and programming language. By adding inexpensive servers, the capacity of the system scales in a near-linear fashion. Most importantly, the switch to CORBA didn’t require a change of operating system or development environment – everything runs on a free operating system using OpenSource components.

Introduction

In 1997, TargetNet built and deployed “The Datacom Ad Network”, a banner ad delivery system. The CGI script that selected and delivered ads was written in Perl, ran on a Pentium 200 machine, and could deliver just one ad per second. Rewriting in C and migration to faster processors took performance to 10 ads per second, then 30, then 50. Further optimization proved futile: we had reached the CGI performance barrier.

Increasing performance by adding processors or hosts was not feasible: the architecture of the existing delivery system was limited to running on a single host, and was single-threaded. Worse, the code used a large number of flat files on disk, and so spent a large percentage of its time performing system calls or waiting for file locks. Increasing processor power provided a brief respite, but we could not afford to upgrade server hardware forever.
To remain competitive, ad delivery performance needed to increase to roughly 400 ads per second, and multiple hosts needed to be able to share the load of the network. It was clear that a new architecture was required that could overcome the limitations of standard CGI scripts. The base requirements of the new system were laid down before any research began. The new system would have to:

- offer single-host performance several times that of the existing CGI script.
- utilize a distributed computing model without arbitrary limits on the number of hosts.
- allow multiple hosts to share the network load, preferably with load balancing and redundancy.
- be called from a web browser like a CGI script but without the inherent limitations of CGI.
- scale to handle anticipated growth over the next four to five years without major architectural changes.
- allow gradual integration of commercial hardware and software without massive re-coding.

Most importantly, the system had to be cost effective. Our server platform was Apache running on FreeBSD, and all software in use was either freely available or developed in-house. While commercial solutions to our problem existed (Oracle Parallel Server running on a commercial UNIX), financial constraints dictated that we find a free solution or develop one in-house.

Breaking the CGI speed barrier

The problem of CGI performance is not new. Over the last several years, many solutions that remove the forked-CGI bottleneck from web applications have come to light. CGI enhancement wrappers like [FASTCGI] allow an existing CGI script (especially one written in an interpreted language like Perl) to run much faster and remain resident between invocations. Still, these wrappers extend the existing CGI standard, sacrificing flexibility for compatibility. They are limited, naturally, to tasks that you would usually use a CGI script for. Other client/server tasks need to be addressed separately.
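The benefit of a handler that remains resident between invocations can be illustrated with a minimal sketch. It is not FastCGI and not TargetNet's code: `StartupCounter`, `serve_forked_style`, and `serve_resident_style` are hypothetical names, the ad identifiers are made up, and the expensive per-process startup (fork, interpreter load, flat-file reads) is simulated by a counter.

```python
"""Sketch: per-request startup (classic CGI) versus a resident handler.

Hypothetical illustration only; startup work is simulated with a counter.
"""
from itertools import cycle


class StartupCounter:
    """Stand-in for expensive process startup (config parsing, file reads)."""

    def __init__(self):
        self.count = 0  # how many times startup work has run

    def load_inventory(self):
        self.count += 1
        return ["ad-1", "ad-2", "ad-3"]  # made-up ad identifiers


def serve_forked_style(startup, num_requests):
    """Classic CGI: every request pays the full startup cost."""
    served = []
    for i in range(num_requests):
        ads = startup.load_inventory()      # repeated for each request
        served.append(ads[i % len(ads)])
    return served


def serve_resident_style(startup, num_requests):
    """Resident handler: startup runs once, then a loop serves requests."""
    ads = cycle(startup.load_inventory())   # paid once per process lifetime
    return [next(ads) for _ in range(num_requests)]


if __name__ == "__main__":
    forked, resident = StartupCounter(), StartupCounter()
    serve_forked_style(forked, 100)
    serve_resident_style(resident, 100)
    print("forked-style startups:", forked.count)
    print("resident-style startups:", resident.count)
```

Both functions return the same sequence of ads; only the number of startups differs, which is the cost a resident wrapper removes.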
Integrating the functionality of a CGI script directly into the web server provides the benefits of a CGI wrapper, and gives developers access to the internals of the web server. Not only does this technique share the same limitations as CGI wrapper toolkits, but developers also have to lock themselves into a particular web server architecture, choosing to develop an ISAPI, NSAPI, or Apache module. Similarities between these architectures are few: moving a complex module from one web server product to another could require a complete rewrite.

Clustered computing solutions promise transparent scalability simply by adding hosts. Unfortunately, most clustering systems are commercial, costly, and tied to a particular line of hardware (though there are alternatives, such as Beowulf clusters running on Linux). Using commodity hardware would have addressed the cost issue, but would have required us to switch from FreeBSD to Linux. In doing so we would be giving up the significant investment we already had in FreeBSD servers and knowledgeable personnel. In short, we felt that the immediate benefits of clustering were outweighed by the commitment that one has to make to a particular vendor and/or operating system.

Remote Procedure Call (RPC) solutions do not share the aforementioned limitations. Most RPC solutions are available for multiple operating systems and hardware platforms. They are abstracted from the application layer, and do not require adherence to one vendor’s API. As it is not directly tied to the CGI model, RPC can be used to replace traditional client/server applications as well. The only issue surrounding RPC is which architecture to use. ONC RPC, developed by Sun Microsystems, is already in wide use on UNIX® systems. ONC RPC is at the heart of NIS, NIS+ and NFS. The limitations of the original ONC RPC have been documented and exploited for as long as it has been in use.
ONC RPC+ addresses many of these limitations and provides for encrypted communication but is not as widely available as the original implementation.

The Distributed Computing Environment (DCE) from the Open Group provides almost every distributed computing tool one could need, but is as complex as it is complete. Suited best for large projects, the administration of DCE can be a monumental task. Only one free implementation of DCE is available, limiting improvement through vendor competition.

We found many of the elements we required in RPC, but we did not find an implementation that provided all of them in one package. We attempted to build our own middleware, without much success. The issues that undoubtedly plagued the developers of ONC RPC and DCE proved too much for our small team of developers.

Returning to the research arena, we began to look at CORBA. We had dismissed CORBA early in the design phase, believing it to be geared towards enterprise computing and unsuitable for our use. An in-depth examination showed promise. CORBA offered everything that we were looking for: unlimited cross-platform capability, several free implementations, and an aggressive development model that promises to keep the technology alive in the future. As discussed in [MODZ97], CORBA also offers features not found in RPC or DCE, including interface portability, dynamic interface invo…

[Figure labels: Object Implementation, Dynamic Invocation, Client, IDL Stubs, ORB Interface, Object Adaptor, Static IDL Skeleton, Dynamic IDL Skeleton]
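The roles of the IDL stub, the skeleton, and the ORB named above can be sketched in a single process. This is a toy under stated assumptions, not a real ORB and not the system described in the paper: there is no IDL compiler, no IIOP, and no network, and JSON bytes stand in for marshalled requests. Every class and method name (`AdServer`, `Skeleton`, `ToyOrb`, `AdServerStub`, `select_ad`) is hypothetical.

```python
"""Sketch of the stub / ORB / skeleton roles, all in one process.

Not a real ORB: requests are marshalled to JSON bytes only to mimic the
boundary between client and server. All names are hypothetical.
"""
import json


class AdServer:
    """Object implementation: the servant that does the real work."""

    def __init__(self, ads):
        self._ads = list(ads)
        self._next = 0

    def select_ad(self, site):
        ad = self._ads[self._next % len(self._ads)]  # simple rotation
        self._next += 1
        return {"site": site, "ad": ad}


class Skeleton:
    """Server side: unmarshal a request, call the servant, marshal the reply."""

    def __init__(self, servant):
        self._servant = servant

    def dispatch(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        op = request["op"]
        if op.startswith("_"):
            raise ValueError("operation not exposed: " + op)
        result = getattr(self._servant, op)(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class ToyOrb:
    """Stand-in for the ORB: routes marshalled requests by object name."""

    def __init__(self):
        self._skeletons = {}

    def register(self, name, servant):
        self._skeletons[name] = Skeleton(servant)

    def invoke(self, name, op, args):
        """Dynamic-invocation style: the operation is named at run time."""
        request = json.dumps({"op": op, "args": list(args)}).encode("utf-8")
        reply = self._skeletons[name].dispatch(request)
        return json.loads(reply.decode("utf-8"))["result"]


class AdServerStub:
    """Static stub: gives the client an ordinary method to call."""

    def __init__(self, orb, name):
        self._orb = orb
        self._name = name

    def select_ad(self, site):
        return self._orb.invoke(self._name, "select_ad", [site])


if __name__ == "__main__":
    orb = ToyOrb()
    orb.register("AdServer", AdServer(["banner-a", "banner-b"]))
    stub = AdServerStub(orb, "AdServer")
    print(stub.select_ad("example.test"))                          # static
    print(orb.invoke("AdServer", "select_ad", ["example.test"]))   # dynamic
```

The client code that calls `stub.select_ad(...)` never sees the marshalling, which is the property that lets the object implementation move to another host or language in a real ORB.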
Publication date: 2000